130 research outputs found

    MPF: A portable message passing facility for shared memory multiprocessors

    Get PDF
    The design, implementation, and performance evaluation of a message passing facility (MPF) for shared memory multiprocessors are presented. The MPF is based on a message passing model conceptually similar to conversations. Participants (parallel processors) can enter or leave a conversation at any time. The message passing primitives for this model are implemented as a portable library of C function calls. The MPF is currently operational on a Sequent Balance 21000, and several parallel applications were developed and tested. Several simple benchmark programs are presented to establish interprocess communication performance for common patterns of interprocess communication. Finally, performance figures are presented for two parallel applications, linear systems solution, and iterative solution of partial differential equations

    Parallel discrete event simulation: A shared memory approach

    Get PDF
    With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to insure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models

    Visualizing Parallel Computer System Performance

    Get PDF
    Parallel computer systems are among the most complex of man's creations, making satisfactory performance characterization difficult. Despite this complexity, there are strong, indeed, almost irresistible, incentives to quantify parallel system performance using a single metric. The fallacy lies in succumbing to such temptations. A complete performance characterization requires not only an analysis of the system's constituent levels, it also requires both static and dynamic characterizations. Static or average behavior analysis may mask transients that dramatically alter system performance. Although the human visual system is remarkedly adept at interpreting and identifying anomalies in false color data, the importance of dynamic, visual scientific data presentation has only recently been recognized Large, complex parallel system pose equally vexing performance interpretation problems. Data from hardware and software performance monitors must be presented in ways that emphasize important events while eluding irrelevant details. Design approaches and tools for performance visualization are the subject of this paper

    Performance analysis integration in the Uintah software development cycle

    Get PDF
    ManuscriptThe increasing complexity of high-performance computing environments and programming methodologies presents challenges for empirical performance evaluation. Evolving parallel and distributed systems require performance technology that can be flexibly configured to observe different events and associated performance data of interest. It must also be possible to integrate performance evaluation techniques with the programming paradigms and software engineering methods. This is particularly important for tracking performance on parallel software projects involving many code teams over many stages of development. This paper describes the integration of the TAU and XPARE tools in the Uintah Computational Framework (UCF). Discussed is the use of performance mapping techniques to associate low-level performance data to higher levels of abstraction in UCF and the use of performance regression testing to provides a historical portfolio of the evolution of application performance. A scalability study shows the benefits of integrating performance technology in building large-scale parallel applications

    05501 Abstracts Collection -- Automatic Performance Analysis

    Get PDF
    From 12.12.05 to 16.12.05, the Dagstuhl Seminar 05501 ``Automatic Performance Analysis\u27\u27 was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

    A 3D Finite-Difference BiCG Iterative Solver with the Fourier-Jacobi Preconditioner for the Anisotropic EIT/EEG Forward Problem

    Get PDF
    The Electrical Impedance Tomography (EIT) and electroencephalography (EEG) forward problems in anisotropic inhomogeneous media like the human head belongs to the class of the three-dimensional boundary value problems for elliptic equations with mixed derivatives. We introduce and explore the performance of several new promising numerical techniques, which seem to be more suitable for solving these problems. The proposed numerical schemes combine the fictitious domain approach together with the finite-difference method and the optimally preconditioned Conjugate Gradient- (CG-) type iterative method for treatment of the discrete model. The numerical scheme includes the standard operations of summation and multiplication of sparse matrices and vector, as well as FFT, making it easy to implement and eligible for the effective parallel implementation. Some typical use cases for the EIT/EEG problems are considered demonstrating high efficiency of the proposed numerical technique

    Kernel-Level Measurement for Integrated Parallel Performance Views: the KTAU Project

    Full text link
    The effect of the operating system on application perfor-mance is an increasingly important consideration in high performance computing. OS kernel measurement is key to understanding the performance influences and the interre-lationship of system and user-level performance factors. The KTAU (Kernel TAU) methodology and Linux-based framework provides parallel kernel performance measure-ment from both a kernel-wide and process-centric perspec-tive. The first characterizes overall aggregate kernel per-formance for the entire system. The second characterizes kernel performance when it runs in the context of a partic-ular process. KTAU extends the TAU performance system with kernel-level monitoring, while leveraging TAU’s mea-surement and analysis capabilities. We explain the rational and motivations behind our approach, describe the KTAU design and implementation, and show working examples on multiple platforms demonstrating the versatility of KTAU in integrated system / application monitoring. 1

    Bacatá: A Language Parametric Notebook Generator (Tool Demo)

    Get PDF
    \u3cp\u3eInteractive notebooks allow people to communicate and collaborate through a single rich document that might include live code, multimedia, computed results, and documentation, which is persisted as a whole for reproducibility. Notebooks are currently being used extensively in domains such as data science, data journalism, and machine learning. However, constructing a notebook interface for a new language requires a lot of effort. In this tool paper, we present Bacatá, a language parametric notebook generator for domain-specific languages (DSL) based on the Jupyter framework. Bacatá is designed so that language engineers may reuse existing language components (such as parsers, code generators, interpreters, etc.) as much as possible. Moreover, we explain the design of Bacatá and how DSL notebooks can be generated with minimum effort in the context of the Rascal meta programming system and language workbench.\u3c/p\u3

    Performance Evaluation of Adaptive Scientific Applications using TAU

    Get PDF
    Fueled by increasing processor speeds and high speed interconnection networks, advances in high performance computer architectures have allowed the development of increasingly complex large scale parallel systems. For computational scientists, programming these systems efficiently is a challenging task. Understanding the performance of their parallel applications i
    • …
    corecore